Skip to content

CLDSRV-863: Checksums for PutObject and UploadPart#6105

Closed
leif-scality wants to merge 7 commits intodevelopment/9.4from
improvement/CLDSRV-863-checksums-put-object-part
Closed

CLDSRV-863: Checksums for PutObject and UploadPart#6105
leif-scality wants to merge 7 commits intodevelopment/9.4from
improvement/CLDSRV-863-checksums-put-object-part

Conversation

@leif-scality
Copy link
Copy Markdown
Contributor

No description provided.

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented Mar 12, 2026

Hello leif-scality,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented Mar 12, 2026

Incorrect fix version

The Fix Version/s in issue CLDSRV-863 contains:

  • None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

  • 9.4.0

Please check the Fix Version/s of CLDSRV-863, or the target
branch of this pull request.


const { unsupportedSignatureChecksums, supportedSignatureChecksums } = require('../../../../constants');

// FIXME: merge this
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FIXME: merge this comment and // Why do we need this check... are dev notes that should be resolved or removed before merging.

— Claude Code

const vault = require('../vault');
const constructChunkStringToSign = require('./constructChunkStringToSign');

// Do we use this one or vaults?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dev note // Do we use this one or vaults? should be resolved and removed before merging.

— Claude Code

}
const value = Buffer.alloc(parsedContentLength);
const cbOnce = jsutil.once(callback);
// TODO
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The // TODO comment here indicates veeam/utils.js still uses the old prepareStream which is no longer wired to checksum validation. This should be addressed in this PR or tracked explicitly.

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 12, 2026

  • Race condition in ChecksumTransform: validateChecksum() is called synchronously in storeObject.js after data.put, but _flush (which computes the digest) is async. If _flush hasn't resolved before the callback fires, this.digest will be undefined and validation will silently pass or give wrong results.
    • Move validation into _flush and emit an error event, or wait for the stream's finish event before calling validateChecksum().
  • Error type mismatch in storeObject.js: prepareStream2 returns errors.InvalidArgument (Arsenal error) in the V4 streaming case, but arsenalErrorFromChecksumError() expects a ChecksumError string enum. This will fall through to the default case returning BadDigest instead of InvalidArgument.
    • Return a ChecksumError enum value from prepareStream2, or handle Arsenal errors separately in the caller.
  • Leftover debug artifacts: Multiple commented-out console.log statements across createAndStoreObject.js, prepareStream.js, and commented-out imports in storeObject.js.
    • Remove all before merging.
  • Unresolved dev notes: // FIXME: merge this in validateChecksumHeaders.js, // Do we use this one or vaults? in V4Transform.js, // TODO in veeam/utils.js, // TODO: test AWS in trailingChecksumTransform.js.
    • Resolve or remove before merging.
  • ChecksumTransform._transform callback: passes 3 args callback(null, input, encoding) but Node.js Transform callback only accepts (error, data).
    • Remove the extra encoding argument.
  • TrailingChecksumTransform trailer parsing: bytes after the trailer \r\n in the same chunk are silently dropped due to the break statement.
    • Verify this matches AWS behavior or handle leftover bytes.

Review by Claude Code

@codecov
Copy link
Copy Markdown

codecov bot commented Mar 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.65%. Comparing base (8c7babd) to head (2a22f09).
✅ All tests successful. No failed tests found.

Additional details and impacted files

Impacted file tree graph

Files with missing lines Coverage Δ
lib/api/apiUtils/integrity/validateChecksums.js 98.75% <100.00%> (+0.79%) ⬆️
lib/api/apiUtils/object/prepareStream.js 100.00% <100.00%> (ø)
lib/api/apiUtils/object/storeObject.js 100.00% <100.00%> (+10.34%) ⬆️
lib/auth/streamingV4/ChecksumTransform.js 100.00% <100.00%> (ø)
lib/auth/streamingV4/trailingChecksumTransform.js 100.00% <100.00%> (+7.54%) ⬆️
lib/services.js 86.96% <100.00%> (+0.06%) ⬆️

... and 2 files with indirect coverage changes

@@                 Coverage Diff                 @@
##           development/9.4    #6105      +/-   ##
===================================================
+ Coverage            84.38%   84.65%   +0.27%     
===================================================
  Files                  206      207       +1     
  Lines                13329    13518     +189     
===================================================
+ Hits                 11248    11444     +196     
+ Misses                2081     2074       -7     
Flag Coverage Δ
file-ft-tests 68.51% <86.54%> (+0.25%) ⬆️
kmip-ft-tests 28.08% <34.97%> (+0.13%) ⬆️
mongo-v0-ft-tests 69.69% <86.54%> (+0.24%) ⬆️
mongo-v1-ft-tests 69.71% <86.54%> (+0.14%) ⬆️
multiple-backend 35.15% <38.11%> (+0.03%) ⬆️
sur-tests 35.57% <30.94%> (-0.93%) ⬇️
sur-tests-inflights 37.39% <30.94%> (-0.16%) ⬇️
unit 70.59% <100.00%> (+0.57%) ⬆️
utapi-v2-tests 34.43% <37.21%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.


const { unsupportedSignatureChecksums, supportedSignatureChecksums } = require('../../../../constants');

// FIXME: merge this
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug comments left in the code: // FIXME: merge this and // Why do we need this check if we have unsupportedSignatureChecksums? appear to be development notes that should be resolved or removed before merging.

— Claude Code

const vault = require('../vault');
const constructChunkStringToSign = require('./constructChunkStringToSign');

// Do we use this one or vaults?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment // Do we use this one or vaults? looks like a development note that should be resolved and removed before merging.

— Claude Code

}
const value = Buffer.alloc(parsedContentLength);
const cbOnce = jsutil.once(callback);
// TODO
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bare // TODO comment with no description. Either add context about what needs to be done (presumably migrating to prepareStream2) or remove if not actionable.

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 12, 2026

  • Race condition in checksum validation (storeObject.js:81): validateChecksum() is called synchronously but ChecksumTransform._flush() is async for crc64nvme. The digest may not be computed yet when validation runs.
    - Await a finish event on the checksumed stream, or make validateChecksum() async and await the digest.
    - Missing error handler on ChecksumTransform (prepareStream.js:88): In the STREAMING-AWS4-HMAC-SHA256-PAYLOAD case, ChecksumTransform has no error listener - a _flush rejection will be unhandled.
    - Add checksumedStream.on('error', errCb).
    - Orphaned data on checksum failure (storeObject.js:84): When checksum validation fails after data.put succeeds, the stored data is never cleaned up.
    - Delete stored data via data.batchDelete before returning the error, as checkHashMatchMD5 does for MD5 mismatches.
    - Development notes left in code: FIXME/TODO/question comments in validateChecksumHeaders.js, V4Transform.js, and veeam/utils.js should be resolved or removed.
    - Duplicate switch cases (prepareStream.js:108-125): UNSIGNED-PAYLOAD and default are identical - combine into a single default case.

    Review by Claude Code

// If the x-amz-trailer header is present the request is using one of the
// trailing checksum algorithms, which are not supported.
if (headers['x-amz-trailer'] !== undefined &&
if (headers['x-amz-trailer'] !== undefined && // Why do we need this check if we have unsupportedSignatureChecksums?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are leftover TODO/question comments that should not be merged:
- Line 8: // Why do we need this check if we have unsupportedSignatureChecksums?

This reads as a development note, not a permanent comment. Either resolve the question or remove it before merging.

— Claude Code

const vault = require('../vault');
const constructChunkStringToSign = require('./constructChunkStringToSign');

// Do we use this one or vaults?
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover development comment: // Do we use this one or vaults? — resolve or remove before merging.

— Claude Code

}
const value = Buffer.alloc(parsedContentLength);
const cbOnce = jsutil.once(callback);
// TODO
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bare // TODO with no description. This call still uses the old prepareStream which doesn't do checksum validation. Either migrate this to prepareStream2 or add a clear explanation of why it's deferred.

— Claude Code

if (xAmzChecksumCnt !== 0) {
return {
error: ChecksumError.TrailerAndChecksum,
details: { trailer: headers['x-amz-trailer'], checksum: checksumHeaders },
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When no checksum headers are present, this returns crc64nvme as default. This causes ChecksumTransform to compute a crc64nvme hash over the entire request body for every PutObject/UploadPart, even when no checksum was requested. The digest is computed but never validated (since expected is undefined). If this is intentional (e.g. to store the computed checksum in metadata later), a comment would help clarify the intent.

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

  • lib/services.js:685 - err.is.NoSuchKey can crash with TypeError if the error is not an Arsenal error.
    • Add defensive check
  • lib/api/apiUtils/object/prepareStream.js:128 - prepareStream and stripTrailingChecksumStream are now dead code.
    • Remove unused exports or keep them only if needed by external consumers
  • lib/api/apiUtils/integrity/validateChecksums.js:207 - Default crc64nvme computation for all requests even without checksum headers.
    • Clarify intent with a comment, or skip computation when no checksum is requested

Review by Claude Code

method: 'prepareStream',
streamingV4ParamsType: typeof streamingV4Params,
streamingV4Params,
});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging streamingV4Params at error level leaks sensitive credentials (accessKey, signatureFromRequest) into production logs. Log only safe fields like the type, not the full object.

— Claude Code

});
return { error: errors.InvalidArgument, stream: null };
}
const v4Transform = new V4Transform(streamingV4Params, log, errCb);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the STREAMING-AWS4-HMAC-SHA256-PAYLOAD case, there is no error event handler on v4Transform. Since .pipe() does not propagate errors, if v4Transform emits an error event (beyond what it routes through errCb internally), the error will be unhandled and checksumedStream will hang. Consider adding v4Transform.on('error', errCb) for consistency with the other transform streams in this function.

— Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

V4Transform never calls callback(err) and never emits 'error' — errors bypass Node's stream error mechanism entirely and go straight to errCb. So there's no unhandled error event
to guard against, and adding v4Transform.on('error', errCb) would be dead code.

TrailingChecksumTransform and ChecksumTransform DO need the on('error', errCb) handlers because they call callback(err) in _transform/_flush, which causes Node streams to emit
'error'. The asymmetry is intentional given the different error-routing designs.

};
}

if (this.trailerChecksumName !== `x-amz-checksum-${this.algoName}`) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation order checks isValidDigest(expected) before verifying the trailer name matches the expected algorithm. If the trailer name does not match (e.g. x-amz-checksum-crc32 sent when crc64nvme was expected), the digest is validated against the wrong algorithm format rules, potentially returning TrailerChecksumMalformed instead of TrailerAlgoMismatch. Consider swapping the order: check trailerChecksumName match first, then validate the digest format.

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

  • Security: credential leak in logs (prepareStream.js:47) - streamingV4Params is logged at error level, exposing accessKey and signatureFromRequest in production logs. Remove the full object from the log call.
  • Missing error handler on v4Transform (prepareStream.js:50) - In the STREAMING-AWS4-HMAC-SHA256-PAYLOAD case, v4Transform has no error event listener. Since .pipe() does not propagate errors, an unhandled error on v4Transform would cause checksumedStream to hang. Add v4Transform.on('error', errCb).
  • Trailer validation order (ChecksumTransform.js:42) - isValidDigest is checked before the trailer name match. A name mismatch could return TrailerChecksumMalformed instead of TrailerAlgoMismatch. Swap the order so name is checked first.

Review by Claude Code

checksumedStream.stream.removeListener('finish', onFinish);
log.error('checksum stream error after data.put', { error: err });
cbOnce(err);
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When ChecksumTransform emits an error after data.put has already stored the object (e.g. _flush fails to compute the digest), the stored data is not cleaned up. Compare with the checksum-mismatch path at line 85 which correctly calls data.batchDelete. This handler should do the same to avoid orphaned data blobs.

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

  • storeObject.js:108-112: onError handler after data.put does not clean up stored data, unlike the checksum-mismatch path at line 85. This can leave orphaned data blobs when ChecksumTransform emits an error.
    • Add data.batchDelete call in onError to match the cleanup pattern used for checksum mismatches.

Review by Claude Code

stream.pipe(checksumedStream);
return { error: null, stream: checksumedStream };
}
case 'UNSIGNED-PAYLOAD': // Fallthrough
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default case now also handles STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER (V4 signed + trailing checksum). With the new checksum logic, getChecksumDataFromHeaders would set isTrailer: true (from the x-amz-trailer header), but the chunked+signed body is piped directly into ChecksumTransform without V4 decoding or chunk-size stripping. The checksum would be computed over the raw wire format and validation would fail with TrailerMissing.

This is a pre-existing gap (old code also did not handle this content-sha256 value), but now it is more relevant since AWS SDK v3 can use this mode when both signing and checksums are enabled. Consider adding a dedicated case that chains V4Transform -> TrailingChecksumTransform -> ChecksumTransform, or returning an explicit NotImplemented error.

— Claude Code

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateChecksumHeaders rejects STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

LGTM — solid implementation of streaming checksum validation for PutObject and UploadPart.

- One gap noted: STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER (V4 signed + trailing checksum) falls into the default case and would fail. Pre-existing gap but more impactful now. See inline comment on prepareStream.js:85.

Review by Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

LGTM

Clean implementation of checksum validation for PutObject and UploadPart. The new ChecksumTransform pipeline is well-structured, stream lifecycle management (waiting for writableFinished before validating) is correct, and the defensive null-safe err.is fix in services.js is a good catch.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-863-checksums-put-object-part branch from 2e261da to b3e28bd Compare March 16, 2026 15:26
@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

LGTM

The checksum pipeline (ChecksumTransform wrapping V4Transform / TrailingChecksumTransform / raw stream) is well-structured. The trailer parsing, async digest handling in _flush, and post-data.put validation with cleanup on mismatch all look correct. The services.js defensive guard (err.is &&) and NoSuchKey race-condition fix are good additions. Arsenal is pinned to a tag (8.3.6). Tests are thorough.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-863-checksums-put-object-part branch from b3e28bd to 5dea795 Compare March 16, 2026 18:31
@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

LGTM

Solid implementation of streaming checksum validation for PutObject and UploadPart. The ChecksumTransform / TrailingChecksumTransform pipeline is well-structured, the mutable onStreamError delegate pattern in dataStore correctly upgrades cleanup behavior after data is stored, and the writableFinished / finish fallback properly handles the async digest computation window. The jsutil.once guards prevent double callbacks throughout the pipeline. The services.js race condition fix for isCompleteMPUInProgress is a good defensive change. Test coverage is comprehensive across all code paths.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-863-checksums-put-object-part branch from 5dea795 to 235cd59 Compare March 16, 2026 19:52
});
return { error: errors.InvalidArgument, stream: null };
}
const v4Transform = new V4Transform(streamingV4Params, log, errCb);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the STREAMING-UNSIGNED-PAYLOAD-TRAILER case (line 69), errCb is wrapped with jsutil.once() to prevent double-callback when both piped streams error. The STREAMING-AWS4-HMAC-SHA256-PAYLOAD case here does not apply the same guard — both V4Transform (via constructor) and ChecksumTransform (via .on('error', errCb)) receive the raw errCb. If both independently error, errCb would be called twice.

The current caller (storeObject.js) wraps the outer callback with jsutil.once, so this is not a bug today, but wrapping here would be more defensive and consistent with the TRAILER path.

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

LGTM — solid implementation of checksum validation for PutObject and UploadPart. The new ChecksumTransform pipeline, trailer parsing, and error cleanup paths are well-structured with thorough test coverage.

One minor note:
- prepareStream.js:51 — The STREAMING-AWS4-HMAC-SHA256-PAYLOAD path does not wrap errCb with jsutil.once(), unlike the STREAMING-UNSIGNED-PAYLOAD-TRAILER path. Not a bug today (caller protects with jsutil.once), but wrapping here would be more defensive and consistent.

Review by Claude Code

@leif-scality leif-scality force-pushed the improvement/CLDSRV-863-checksums-put-object-part branch from 235cd59 to 2a22f09 Compare March 16, 2026 21:03
// If AWS there is a mismatch, AWS returns the same error as if the algo was invalid.
if (sdkLowerAlgo !== algo) {
return { error: ChecksumError.AlgoNotSupportedSDK, details: { algorithm: sdkAlgo } };
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: "If AWS there is a mismatch" should be "If there is a mismatch".

— Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Mar 16, 2026

LGTM — solid implementation of checksum validation for PutObject and UploadPart. The stream pipeline design (V4Transform/TrailingChecksumTransform → ChecksumTransform) is clean, the mutable errCb upgrade pattern in storeObject correctly handles cleanup after data.put, and the once-guards prevent double callbacks throughout. Good test coverage across all protocol variants.

- Minor comment typo in validateChecksums.js:185 ("If AWS there is a mismatch" → "If there is a mismatch").

Review by Claude Code

@leif-scality leif-scality deleted the improvement/CLDSRV-863-checksums-put-object-part branch March 16, 2026 21:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants